AITopics | information profile

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Error Bounds and Optimal Schedules for Masked Diffusions with Factorized Approximations

Lavenant, Hugo, Zanella, Giacomo

arXiv.org Machine Learning, Oct-30-2025

Recently proposed generative models for discrete data, such as Masked Diffusion Models (MDMs), exploit conditional independence approximations to reduce the computational cost of popular Auto-Regressive Models (ARMs), at the price of some bias in the sampling distribution. We study the resulting computation-vs-accuracy trade-off, providing general error bounds (in relative entropy) that depend only on the average number of tokens generated per iteration and are independent of the data dimensionality (i.e. sequence length), thus supporting the empirical success of MDMs. We then investigate the gain obtained by using non-constant schedule sizes (i.e. varying the number of unmasked tokens during the generation process) and identify the optimal schedule as a function of a so-called information profile of the data distribution, thus allowing for a principled optimization of schedule sizes. We define methods directly as sampling algorithms and do not use classical derivations as time-reversed diffusion processes, leading us to simple and transparent proofs.
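As a rough illustration of the setting described in the abstract above, the following Python sketch unmasks a sequence in iterations whose sizes are given by a schedule, sampling the tokens chosen in each iteration independently of one another (the factorized approximation). It is a toy under stated assumptions, not the authors' algorithm: the names `sample_with_schedule`, `uniform_marginals`, and `average_tokens_per_iteration` are hypothetical, positions are chosen uniformly at random, and a uniform distribution stands in for a learned model.

```python
import random
from typing import Callable, List, Optional, Sequence

MASK = None  # placeholder value for a position that is still masked


def uniform_marginals(context: Sequence[Optional[str]], position: int,
                      vocab: Sequence[str]) -> List[float]:
    """Hypothetical stand-in for a learned model: uniform over the vocabulary."""
    return [1.0 / len(vocab)] * len(vocab)


def sample_with_schedule(length: int, schedule: Sequence[int],
                         vocab: Sequence[str],
                         marginals: Callable = uniform_marginals,
                         seed: int = 0) -> List[Optional[str]]:
    """Unmask `length` tokens, revealing schedule[i] tokens in iteration i."""
    if sum(schedule) != length or any(k <= 0 for k in schedule):
        raise ValueError("schedule sizes must be positive and sum to length")
    rng = random.Random(seed)
    tokens: List[Optional[str]] = [MASK] * length
    masked = list(range(length))
    for k in schedule:
        # Assumption: positions to unmask are picked uniformly at random.
        chosen = rng.sample(masked, k)
        # Factorized approximation: every chosen position is sampled from its
        # own conditional marginal given the same context, ignoring the other
        # tokens revealed in this iteration.
        context = list(tokens)
        for pos in chosen:
            probs = marginals(context, pos, vocab)
            tokens[pos] = rng.choices(list(vocab), weights=probs, k=1)[0]
        chosen_set = set(chosen)
        masked = [p for p in masked if p not in chosen_set]
    return tokens


def average_tokens_per_iteration(schedule: Sequence[int]) -> float:
    """Sequence length divided by the number of iterations."""
    return sum(schedule) / len(schedule)
```

For example, the non-constant schedule `[1, 1, 2, 4]` generates 8 tokens in 4 iterations, so it averages 2 tokens per iteration, the same as the constant schedule `[2, 2, 2, 2]`.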

artificial intelligence, information profile, machine learning, (15 more...)

arXiv.org Machine Learning

2510.25544

Country: Europe (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

0c74b7f78409a4022a2c4c5a5ca3ee19-Paper.pdf

Neural Information Processing Systems, Mar-15-2024, 15:44:21 GMT

Languages vary widely in many ways, including their canonical word order. A basic aspect of the observed variation is the fact that some word orders are much more common than others. Although this regularity has been recognized for some time, it has not been well-explained. In this paper we offer an information-theoretic explanation for the observed word-order distribution across languages, based on the concept of Uniform Information Density (UID). We suggest that object-first languages are particularly disfavored because they are highly non-optimal if the goal is to distribute information content approximately evenly throughout a sentence, and that the rest of the observed word-order distribution is at least partially explainable in terms of UID. We support our theoretical analysis with data from child-directed speech and experimental work.

hypothesis, information, word order, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Oceania > Australia > South Australia > Adelaide (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > New Jersey > Bergen County > Mahwah (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Unsupervised pre-training helps to conserve views from input distribution

Pinchaud, Nicolas

arXiv.org Machine Learning, May-30-2019

We investigate the effects of unsupervised pre-training from the perspective of information theory. If the input distribution displays multiple views of the supervision, then unsupervised pre-training makes it possible to learn a hierarchical representation that communicates these views across layers while disentangling the supervision. Disentanglement of supervision makes the learned features conditionally independent given the label. In the case of binary features, we show that conditional independence makes it possible to extract the label's information with a linear model and therefore helps to solve under-fitting. We suppose that representations displaying multiple views help to solve over-fitting because each view provides information that helps to reduce the model's variance. We propose a practical method to measure both disentanglement of supervision and the quantity of views within a binary representation. We show that unsupervised pre-training helps to conserve views from the input distribution, whereas representations learned using supervised models disregard most of them.

artificial intelligence, information, machine learning, (18 more...)

arXiv.org Machine Learning

1905.12889

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Why are some word orders more common than others? A uniform information density account

Maurits, Luke, Navarro, Dan, Perfors, Amy

Neural Information Processing Systems, Dec-31-2010

Languages vary widely in many ways, including their canonical word order. A basic aspect of the observed variation is the fact that some word orders are much more common than others. Although this regularity has been recognized for some time, it has not been well-explained. In this paper we offer an information-theoretic explanation for the observed word-order distribution across languages, based on the concept of Uniform Information Density (UID). We suggest that object-first languages are particularly disfavored because they are highly non-optimal if the goal is to distribute information content approximately evenly throughout a sentence, and that the rest of the observed word-order distribution is at least partially explainable in terms of UID. We support our theoretical analysis with data from child-directed speech and experimental work.
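To make the idea of distributing information content evenly through a sentence concrete, the Python sketch below computes the information content (surprisal, in bits) of each word from its conditional probability and summarizes unevenness as the variance of those values. This is only an illustration: the function names are hypothetical, the probabilities are made up, and variance is one simple unevenness measure chosen here, not necessarily the measure used in the paper.

```python
import math
from typing import List, Sequence


def surprisals(probs: Sequence[float]) -> List[float]:
    """Information content of each word in bits: -log2 of its probability."""
    return [-math.log2(p) for p in probs]


def surprisal_variance(probs: Sequence[float]) -> float:
    """Population variance of per-word surprisal; 0 means perfectly uniform."""
    s = surprisals(probs)
    mean = sum(s) / len(s)
    return sum((x - mean) ** 2 for x in s) / len(s)


# Made-up conditional probabilities for the words of two three-word orderings.
even_order = [0.25, 0.25, 0.25]      # every word carries 2 bits
uneven_order = [0.5, 0.5, 0.015625]  # 1 bit, 1 bit, then 6 bits
```

Here `surprisal_variance(even_order)` is zero because the information is spread evenly, while `surprisal_variance(uneven_order)` is larger because most of the information arrives with the last word.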

artificial intelligence, natural language, word order, (19 more...)

Neural Information Processing Systems

Country: Oceania > Australia (0.28)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
